Japanese Character Encoding for Internet Messages


This memo does not specify an Internet standard. Distribution of this
memo is unlimited.


This document describes the encoding used in electronic mail [RFC822]
messages in several Japanese networks. It was first used in JUNET
[JUNET]. The encoding is now also widely used in Japanese IP
communities.


The name given to this encoding is "ISO-2022-JP", which is intended
to be used in the "charset" parameter field of MIME headers (see
[MIME1] and [MIME2]).


Description


The text starts in ASCII [ASCII], and switches to Japanese characters
through an escape sequence. For example, the escape sequence ESC $ B
(three bytes, hexadecimal values: 1B 24 42) indicates that the bytes
following this escape sequence are Japanese characters, which are
encoded in two bytes each. To switch back to ASCII, the escape
sequence ESC ( B is used.


The following table gives the escape sequences and the character sets
used in ISO-2022-JP messages. The ISOREG number is the registration
number in ISO's registry [ISOREG].


Esc Seq    Character Set                  ISOREG

ESC ( B    ASCII                             6
ESC ( J    JIS X 0201-1976 ("Roman" set)    14
ESC $ @    JIS X 0208-1978                  42
ESC $ B    JIS X 0208-1983                  87


Note that JIS X 0208 was called JIS C 6226 until the name was changed
on March 1st, 1987. Likewise, JIS C 6220 was renamed JIS X 0201.
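As an illustration of the table above, the following Python sketch
lists the character sets that a byte string switches to, in order. It
is not part of the encoding's definition. The helper name
charset_switches is hypothetical, and the sample text is produced with
Python's standard "iso2022_jp" codec, which is assumed here to follow
this memo.

```python
# Escape sequences from the table above, keyed by their three raw bytes.
ESCAPE_SEQUENCES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": 'JIS X 0201-1976 ("Roman" set)',
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}


def charset_switches(data):
    """Return the names of the character sets selected in data, in order."""
    switches = []
    i = data.find(b"\x1b")
    while i != -1:
        seq = data[i:i + 3]
        if seq not in ESCAPE_SEQUENCES:
            raise ValueError("unexpected escape sequence %r at offset %d" % (seq, i))
        switches.append(ESCAPE_SEQUENCES[seq])
        i = data.find(b"\x1b", i + 3)
    return switches


# Sample text: ASCII followed by three JIS X 0208 characters.
encoded = "JUNET 日本語".encode("iso2022_jp")
print(encoded.hex(" "))
print(charset_switches(encoded))
```

The hexadecimal dump should show 1b 24 42 in front of the two-byte
characters and 1b 28 42 at the end of the text. Searching for the ESC
byte is safe because neither a single-byte character nor either byte
of a two-byte character may be ESC.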


The "Roman" character set of JIS X 0201 [JISX0201] is identical to 
ASCII except for backslash (\) and tilde (~). The backslash is
replaced by the Yen sign, and the tilde is replaced by overline. This 
set is Japan’s national variant of ISO 646 [ISO646]. 
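The two differences from ASCII can be sketched in Python as a small
translation step. The function name decode_jis_roman is hypothetical,
and the choice of U+00A5 (YEN SIGN) and U+203E (OVERLINE) as the
Unicode equivalents is an assumption of this sketch, not something
this memo specifies.

```python
def decode_jis_roman(data):
    """Decode bytes in the JIS X 0201 "Roman" set to a Unicode string."""
    text = data.decode("ascii")          # identical to ASCII for all other codes
    return text.translate({
        0x5C: 0x00A5,                    # backslash position holds the Yen sign
        0x7E: 0x203E,                    # tilde position holds the overline
    })


print(decode_jis_roman(b"\\100 ~"))
```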


The JIS X 0208 [JISX0208] character sets consist of Kanji, Hiragana, 
Katakana and some other symbols and characters. Each character takes 
up two bytes. 


For further details about the JIS Japanese national character set 
standards, refer to [JISX0201] and [JISX0208]. For further 
information about the escape sequences, see [ISO2022] and [ISOREG]. 


If there are JIS X 0208 characters on a line, there must be a switch
to ASCII or to the "Roman" set of JIS X 0201 before the end of the
line (i.e., before the CRLF). This means that the next line starts in
the character set that was switched to before the end of the previous
line.


Also, the text must end in ASCII.


Other restrictions are given in the Formal Syntax below.


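The two rules above (no JIS X 0208 state at a CRLF, and ASCII at the
end of the text) could be checked along the following lines. This is
a minimal Python sketch; the names state_after and check_line_rules
are hypothetical, and the sketch does not check any other restriction.

```python
ASCII = b"\x1b(B"
SINGLE_BYTE = {b"\x1b(B", b"\x1b(J"}   # ASCII and the JIS X 0201 "Roman" set
DOUBLE_BYTE = {b"\x1b$@", b"\x1b$B"}   # JIS X 0208-1978 and JIS X 0208-1983


def state_after(chunk, state):
    """Return the escape sequence in effect after reading chunk."""
    i = chunk.find(b"\x1b")
    while i != -1:
        seq = chunk[i:i + 3]
        if seq not in SINGLE_BYTE and seq not in DOUBLE_BYTE:
            raise ValueError("unexpected escape sequence %r" % seq)
        state = seq
        i = chunk.find(b"\x1b", i + 3)
    return state


def check_line_rules(body):
    """Return a list of rule violations found in body (empty if none)."""
    problems = []
    state = ASCII                          # the text starts in ASCII
    chunks = body.split(b"\r\n")
    for number, chunk in enumerate(chunks, start=1):
        state = state_after(chunk, state)
        followed_by_crlf = number < len(chunks)
        if followed_by_crlf and state in DOUBLE_BYTE:
            problems.append("line %d: still in JIS X 0208 at CRLF" % number)
    if state != ASCII:
        problems.append("text does not end in ASCII")
    return problems


print(check_line_rules(b"\x1b$BF|K\\8l\r\n\x1b(Babc"))
```

The state is carried from one line to the next, so a line that ends in
the "Roman" set leaves the following line in the "Roman" set as well.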
Formal Syntax 


The notational conventions used here are identical to those used in 
RFC 822 [RFC822]. 


The * (asterisk) convention is as follows:

    l*m something

meaning at least l and at most m somethings, with l and m taking
default values of 0 and infinity, respectively.


message             = headers 1*( CRLF *single-byte-char *segment
                      single-byte-seq *single-byte-char )
                                      ; see also [MIME1] "body-part"
                                      ; note: must end in ASCII

headers             = <see [RFC822] "fields" and [MIME1] "body-part">

segment             = single-byte-segment / double-byte-segment

single-byte-segment = single-byte-seq 1*single-byte-char

double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )

single-byte-seq     = ESC "(" ( "B" / "J" )

double-byte-seq     = ESC "$" ( "@" / "B" )

CRLF                = CR LF

                                                 ; ( Octal, Decimal.)

ESC                 = <ISO 2022 ESC, escape>     ; (    33,      27.)

SI                  = <ISO 2022 SI, shift-in>    ; (    17,      15.)

SO                  = <ISO 2022 SO, shift-out>   ; (    16,      14.)

CR                  = <ASCII CR, carriage return>; (    15,      13.)

LF                  = <ASCII LF, linefeed>       ; (    12,      10.)

one-of-94           = <any one of 94 values>     ; (41-176, 33.-126.)

7BIT                = <any 7-bit value>          ; ( 0-177,  0.-127.)

single-byte-char    = <any 7BIT, including bare CR & bare LF, but NOT
                       including CRLF, and not including ESC, SI, SO>


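As a rough illustration, the "segment" rule and its terminals can be
transcribed into a Python regular expression. This is a sketch under
the assumption that matching individual segments is enough for the
reader's purpose; it does not implement the full "message" rule, and
the name split_segments is hypothetical.

```python
import re

# single-byte-char: any 7-bit value except SO (0x0E), SI (0x0F) and
# ESC (0x1B), and never the two-byte sequence CRLF.
SINGLE_BYTE_CHAR = rb"(?:(?!\r\n)[\x00-\x0d\x10-\x1a\x1c-\x7f])"

SEGMENT = re.compile(
    # single-byte-segment = single-byte-seq 1*single-byte-char
    rb"(?P<single>\x1b\([BJ]" + SINGLE_BYTE_CHAR + rb"+)"
    # double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
    rb"|(?P<double>\x1b\$[@B](?:[\x21-\x7e]{2})+)"
)


def split_segments(data):
    """Return (kind, bytes) pairs for every segment found in data."""
    return [(m.lastgroup, m.group()) for m in SEGMENT.finditer(data)]


for kind, raw in split_segments(b"abc \x1b$BF|K\\8l\x1b(B def"):
    print(kind, raw)
```

Leading single-byte characters that precede the first escape sequence
are not part of any segment, so they do not appear in the result. A
bare single-byte-seq with no characters after it is likewise not a
segment under the grammar and is not reported.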
MIME Considerations 


The name given to the JUNET character encoding is "ISO-2022-JP". This 
name is intended to be used in MIME messages as follows: 


Content-Type: text/plain; charset=iso-2022-jp


The ISO-2022-JP encoding is already in 7-bit form, so it is not 
necessary to use a Content-Transfer-Encoding header. It should be 
noted that applying the Base64 or Quoted-Printable encoding will 
render the message unreadable in current JUNET software. 


ISO-2022-JP may also be used in MIME Part 2 headers. The "B"
encoding should be used with ISO-2022-JP text. 
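For illustration, a header value in the "B" encoding can be produced
with the email package of the Python standard library. This sketch
shows the behavior of one implementation only and assumes that it
selects the "B" encoding for this charset; the subject text is made up
for the example.

```python
from email.header import Header, decode_header

# Encode a Japanese subject as a MIME Part 2 encoded-word.
subject = Header("日本語", charset="iso-2022-jp")
encoded = subject.encode()
print("Subject:", encoded)
print("Content-Type: text/plain; charset=iso-2022-jp")

# Decode it again to confirm that the text survives the round trip.
raw, charset = decode_header(encoded)[0]
print(raw.decode(charset))
```

The body itself is sent as is, without a Content-Transfer-Encoding
step, since it is already in 7-bit form.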


Background Information 


The JUNET encoding was described in the JUNET User’s Guide [JUNET] 
(JUNET Riyou No Tebiki Dai Ippan). 


The encoding is based on the particular usage of ISO 2022 announced 
by 4/1 (see [ISO2022] for details). However, the escape sequence
normally used for this announcement is not included in ISO-2022-JP 
messages. 


The Kana set of JIS X 0201 is not used in ISO-2022-JP messages. 


In the past, some systems erroneously used the escape sequence ESC ( 
H in JUNET messages. This escape sequence is officially registered 
for a Swedish character set [ISOREG], and should not be used in ISO- 
2022-JP messages. 


Some systems do not distinguish between ESC ( B and ESC ( J or 
between ESC $ @ and ESC $ B for display. However, when relaying a 
message to another system, the escape sequences must not be altered 
in any way. 


The human user (not implementor) should try to keep lines within 80 
display columns, or, preferably, within 75 (or so) columns, to allow 
insertion of ">" at the beginning of each line in excerpts. Each JIS 
X 0208 character takes up two columns, and the escape sequences do 
not take up any columns. The implementor is reminded that JIS X 0208 
characters take up two bytes and should not be split in the middle to 
break lines for displaying, etc. 
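The column arithmetic described above can be sketched in Python as
follows. The function name display_columns is hypothetical. The sketch
assumes that the line starts in a single-byte character set and counts
every single-byte value, including control characters, as one column.

```python
def display_columns(line):
    """Count display columns for one ISO-2022-JP line (without its CRLF)."""
    columns = 0
    double_byte = False            # assumed: the line starts in a single-byte set
    i = 0
    while i < len(line):
        if line[i] == 0x1B:        # escape sequences take up no columns
            seq = line[i:i + 3]
            if seq in (b"\x1b$@", b"\x1b$B"):
                double_byte = True
            elif seq in (b"\x1b(B", b"\x1b(J"):
                double_byte = False
            else:
                raise ValueError("unexpected escape sequence %r" % seq)
            i += 3
        elif double_byte:          # one JIS X 0208 character: two bytes, two columns
            columns += 2
            i += 2
        else:                      # one single-byte character: one column
            columns += 1
            i += 1
    return columns


print(display_columns("abc 日本語".encode("iso2022_jp")))
```

Because the loop always advances by two bytes while in a JIS X 0208
character set, any position it stops at is a safe place to break the
line without splitting a character.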


The JIS X 0208 standard was revised in 1990, to add two characters at 
the end of the table. Although ISO 2022 specifies special additional 
escape sequences to indicate the use of revised character sets, it is 
suggested here not to make use of this special escape sequence in 
ISO-2022-JP text, even if the two characters added to JIS X 0208 in 
1990 are used. 


For further information about Japanese character encodings such as PC 
codes, FTP locations of implementations, etc, see "Electronic 
Handling of Japanese Text" [JPN.INF]. 


Security Considerations


Security issues are not discussed in this memo. 